component assignment
A Bayesian Nonparametrics View into Deep Representations (supplementary material, Appendix A: Collapsed Gibbs Sampling for DP-GMM)
Here we describe CGS in more detail. Substituting into Eqn. 10, the expression under the last integral in Eqn. 13 is tractable thanks to the conjugacy of the Normal-inverse-Wishart prior with the Gaussian likelihood, and the posterior predictive density (10) can therefore be written as a mixture of multivariate Student's t distributions. The CIFAR experiments used the standard train/test split; results for architectures not included in Section 4 are summarized in Fig. C.1, and the CNN architectures used in the experiments are listed in Table C.1.
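The conjugate update behind this result is standard. As a hedged reconstruction (the paper's own Eqns. 10 and 13 are not reproduced here, and may specialise this differently), the Normal-inverse-Wishart posterior predictive for a component containing $n$ points with sample mean $\bar{x}$ and scatter matrix $S$, under prior hyperparameters $(\mu_0, \kappa_0, \nu_0, \Lambda_0)$ in $d$ dimensions, is the multivariate Student's t:

```latex
% Standard Normal-inverse-Wishart posterior predictive (see e.g. Murphy,
% "Conjugate Bayesian analysis of the Gaussian distribution").
% Assumed notation: prior hyperparameters (\mu_0, \kappa_0, \nu_0, \Lambda_0),
% n points in the component with mean \bar{x} and scatter matrix S, dimension d.
\begin{align}
  \kappa_n &= \kappa_0 + n, \qquad \nu_n = \nu_0 + n,\\
  \mu_n &= \frac{\kappa_0\,\mu_0 + n\,\bar{x}}{\kappa_0 + n},\\
  \Lambda_n &= \Lambda_0 + S
      + \frac{\kappa_0\, n}{\kappa_0 + n}\,(\bar{x}-\mu_0)(\bar{x}-\mu_0)^{\top},\\
  p(x^{\ast}\mid X) &= t_{\nu_n - d + 1}\!\left(x^{\ast}\;\middle|\;
      \mu_n,\ \frac{\Lambda_n\,(\kappa_n + 1)}{\kappa_n\,(\nu_n - d + 1)}\right).
\end{align}
```

In the collapsed sampler this density scores each candidate component assignment without ever instantiating the component means and covariances.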
Mixture models for data with unknown distributions
We describe and analyze a broad class of mixture models for real-valued multivariate data in which the probability density of observations within each component of the model is represented as an arbitrary combination of basis functions. Fits to these models give us a way to cluster data with distributions of unknown form, including strongly non-Gaussian or multimodal distributions, and return both a division of the data and an estimate of the distributions, effectively performing clustering and density estimation within each cluster at the same time. We describe two fitting methods, one using an expectation-maximization (EM) algorithm and the other a Bayesian non-parametric method using a collapsed Gibbs sampler. The former is numerically efficient, but gives only point estimates of the probability densities. The latter is more computationally demanding but returns a full Bayesian posterior and also an estimate of the number of components. We demonstrate our methods with a selection of illustrative applications and give code implementing both algorithms.
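As an illustration, here is a minimal EM sketch along the lines the abstract describes, assuming a hypothetical interface in which the basis densities are pre-evaluated at the data points; the function name, its arguments, and the nested coefficient update are our assumptions, not the authors' released code.

```python
import numpy as np

def em_basis_mixture(x_basis, K, n_iter=200, seed=None):
    """EM for a K-component mixture whose component densities are
    nonnegative combinations of fixed, normalised basis densities.

    Hypothetical interface (not the authors' code): `x_basis` is an
    (n_points, n_basis) array holding each basis density evaluated at
    each data point; every column corresponds to a density integrating
    to 1, so any convex combination of columns is itself a density.
    """
    rng = np.random.default_rng(seed)
    n, B = x_basis.shape
    pi = np.full(K, 1.0 / K)                 # mixture weights
    w = rng.dirichlet(np.ones(B), size=K)    # basis coefficients, one row per component

    for _ in range(n_iter):
        # E-step: responsibility r[i, k] ∝ pi_k * f_k(x_i), with f_k = sum_b w[k, b] φ_b
        dens = np.clip(x_basis @ w.T, 1e-300, None)   # (n, K) component densities at data
        r = pi * dens
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weights from expected counts; coefficients via the nested
        # latent variable "which basis function generated the point"
        pi = r.mean(axis=0)
        occ = r[:, :, None] * w[None, :, :] * x_basis[:, None, :] / dens[:, :, None]
        w = occ.sum(axis=0)
        w /= w.sum(axis=1, keepdims=True)
    return pi, w, r
```

For example, `x_basis` could hold Gaussian kernels centred on a fixed grid, evaluated at the observations, so that each fitted component density is a kernel mixture of otherwise unknown form.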
Fast sampling and model selection for Bayesian mixture models
We describe two Monte Carlo algorithms for sampling from the integrated posterior distributions of a range of Bayesian mixture models. Both algorithms allow us to directly sample not only the assignment of observations to components but also the number of components, thereby fitting the model and performing model selection over the number of components in a single computation. The first algorithm is a traditional collapsed Gibbs sampler, albeit with an unusual move-set; the second builds on the first, adding rejection-free sampling from the prior over component assignments, to create an algorithm that has excellent mixing time in typical applications and outperforms current state-of-the-art methods, in some cases by a wide margin. We demonstrate our methods with a selection of applications to latent class analysis.
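For orientation, the sketch below shows one sweep of a textbook collapsed Gibbs sampler for latent class analysis with a fixed number of components $K$; it illustrates the integrated posterior being sampled, but not the paper's unusual move-set or its direct sampling of the number of components. The function name and the symmetric Dirichlet hyperparameters are assumptions for illustration.

```python
import numpy as np

def lca_gibbs_sweep(X, z, K, alpha=1.0, beta=1.0, seed=None):
    """One sweep of a standard collapsed Gibbs sampler for latent class
    analysis: X[i, j] in {0, ..., C-1} are categorical features, with
    symmetric Dirichlet(beta) priors on the class-conditional category
    probabilities and Dirichlet(alpha/K) on the class weights.
    A textbook baseline with K fixed, not the paper's move-set.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    C = int(X.max()) + 1                     # assume a common category count
    cols = np.arange(m)
    nk = np.bincount(z, minlength=K).astype(float)   # class sizes
    cnt = np.zeros((K, m, C))                # per-class category counts
    for i in range(n):
        cnt[z[i], cols, X[i]] += 1.0

    for i in range(n):
        nk[z[i]] -= 1.0                      # remove point i from its class
        cnt[z[i], cols, X[i]] -= 1.0
        # collapsed conditional: p(z_i = k | ...) ∝ (n_k + alpha/K)
        #   * prod_j (cnt[k, j, x_ij] + beta) / (n_k + C * beta)
        logp = (np.log(nk + alpha / K)
                + np.log(cnt[:, cols, X[i]] + beta).sum(axis=1)
                - m * np.log(nk + C * beta))
        p = np.exp(logp - logp.max())
        z[i] = rng.choice(K, p=p / p.sum())
        nk[z[i]] += 1.0                      # add point i back under its new class
        cnt[z[i], cols, X[i]] += 1.0
    return z
```

Because the class-conditional probabilities are integrated out, each reassignment depends only on the count statistics, which is what makes the richer move-sets the abstract describes possible.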
Boltzmann Exploration Expectation-Maximisation
We present a general method for fitting finite mixture models (FMMs). Learning in a mixture model consists of finding the most likely cluster assignment for each data point, as well as finding the parameters of the clusters themselves. In many mixture models this is difficult with current learning methods, the most common approach being a monotone learning algorithm such as the conventional expectation-maximisation algorithm. While effective, the success of any monotone algorithm depends crucially on good parameter initialisation, where a common choice is $K$-means initialisation, as typically used for Gaussian mixture models. For other types of mixture models, the path to good initialisation parameters is often unclear and may require a problem-specific solution. To this end, we propose a general heuristic learning algorithm that utilises Boltzmann exploration to assign each observation to a specific base distribution within the mixture model, which we call Boltzmann exploration expectation-maximisation (BEEM). With BEEM, hard assignments allow straightforward parameter learning for each base distribution by conditioning only on its assigned observations. Consequently, it can be applied to mixtures of any base distribution for which single-component parameter learning is tractable. The stochastic learning procedure is able to escape local optima and is thus insensitive to parameter initialisation. We show competitive performance on a number of synthetic benchmark cases as well as on real-world datasets.
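A minimal sketch of the BEEM idea for the Gaussian mixture case: per-point component log-likelihoods are turned into a Boltzmann distribution at temperature $T$, hard assignments are sampled from it, each component is refit on its assigned points only, and $T$ is annealed. The geometric cooling schedule and the initialisation details are our assumptions, not taken from the paper.

```python
import numpy as np

def beem_gaussian(x, K, n_iter=100, T0=2.0, cooling=0.95, seed=None):
    """Sketch of Boltzmann-exploration EM for a 1-D Gaussian mixture,
    following the abstract's description (assumed details: geometric
    temperature schedule, random data points as initial means)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    mu = rng.choice(x, K, replace=False)     # initial means: random data points
    sigma = np.full(K, x.std())
    pi = np.full(K, 1.0 / K)
    T = T0
    for _ in range(n_iter):
        # per-point, per-component log-likelihood, shape (n, K)
        ll = (np.log(pi) - 0.5 * np.log(2 * np.pi * sigma**2)
              - 0.5 * ((x[:, None] - mu) / sigma) ** 2)
        # Boltzmann exploration: sample hard assignments from softmax(ll / T)
        p = np.exp((ll - ll.max(axis=1, keepdims=True)) / T)
        p /= p.sum(axis=1, keepdims=True)
        z = np.array([rng.choice(K, p=row) for row in p])
        # hard M-step: refit each component on its assigned points only
        for k in range(K):
            xk = x[z == k]
            if len(xk) > 1:
                mu[k], sigma[k] = xk.mean(), max(xk.std(), 1e-3)
            pi[k] = max(len(xk), 1) / n      # floor keeps empty components alive
        pi /= pi.sum()
        T *= cooling                         # anneal toward greedy assignment
    return mu, sigma, pi, z
```

At high $T$ the assignments are nearly uniform, which lets the sampler explore; as $T$ falls the procedure approaches hard EM, which is why the method is less sensitive to where the parameters start.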